Skip to content

feat: screenshot digest response-view — Phase 4#945

Merged
thymikee merged 4 commits into
mainfrom
feat/phase4-screenshot-digest-view
Jun 30, 2026
Merged

feat: screenshot digest response-view — Phase 4#945
thymikee merged 4 commits into
mainfrom
feat/phase4-screenshot-digest-view

Conversation

@thymikee

Copy link
Copy Markdown
Member

What

Adds a screenshot entry to the Phase 4 leveled RESPONSE_VIEWS registry (src/daemon/response-views.ts), extending the pattern introduced for snapshot in #942 (now merged to main; this PR is the next Phase 4 slice).

Digest shape

At responseLevel digest, the screenshot view:

  • keeps the cheap fieldspath (the primary result) and artifacts (the client's image-retrieval handle, grafted on by request finalization), preserved only when present;
  • collapses the token-heavy overlayRefs array — each entry carries ref + optional label + three geometry rects (rect, overlayRect, center) — to a total overlayCount plus the first 12 refs leveled down to { ref, label }. The per-overlay geometry is the token sink that --overlay-refs emits when many nodes are annotated.
// digest
{ "path": "…/screenshot.png", "overlayCount": 2, "overlayRefs": [ { "ref": "e1", "label": "Continue" },  ], "artifacts": [  ] }

default and full return today's data unchanged (same reference). The router only calls a view at digest/full for a registered command, so default and every unregistered command stay byte-identical.

Grounding

The shape is grounded in the real handler return — not invented:

  • result type ScreenshotResultData = { path?: string; overlayRefs?: ScreenshotOverlayRef[] }src/utils/screenshot-result.ts:4
  • ScreenshotOverlayRef = { ref; label?; rect; overlayRect; center }src/kernel/snapshot.ts:125
  • overlayRefs attached to the daemon response data — src/daemon/request-generic-dispatch.ts:328 (data.overlayRefs = overlayRefs)
  • artifacts grafted onto screenshot responses by finalization — src/daemon/request-finalization.ts:82

Invariant

The view is a pure read of the default data; it never mutates its input, and only restructures at digest.

Tests

src/daemon/__tests__/response-views.test.ts — mirrors the snapshot view test: digest collapses overlay geometry to overlayCount + leveled { ref, label } refs while keeping path/artifacts, caps the list at 12 (counting all), default/full passthrough by reference, and tolerates a path-only result with no overlay refs.

Verification

  • tsc --noEmit → 0
  • oxfmt --check + oxlint --deny-warnings → clean
  • fallow audit --base origin/main → no issues
  • vitest run src/daemon/__tests__/response-views.test.ts → 9/9
  • vitest run daemon → 938/938 (98 files)
  • Layering Guard grep → empty

Add a `screenshot` entry to the Phase 4 RESPONSE_VIEWS registry. At
responseLevel `digest`, the view keeps the cheap result fields — the
captured `path` and the `artifacts` retrieval handle — and collapses the
token-heavy `overlayRefs` array (each carrying ref + label + three
geometry rects) to a total `overlayCount` plus the first 12 refs leveled
down to `{ ref, label }`. `default`/`full` return today's shape unchanged,
so unregistered and default-level responses stay byte-identical.
@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown

Size Report

Metric Base Current Diff
JS raw 1.4 MB 1.4 MB +712 B
JS gzip 455.5 kB 455.8 kB +211 B
npm tarball 561.0 kB 561.4 kB +387 B
npm unpacked 2.0 MB 2.0 MB +1.0 kB

Startup median (7 runs, lower is better):

Scenario Base Current Diff
CLI --version 29.1 ms 30.2 ms +1.2 ms
CLI --help 50.1 ms 52.1 ms +2.0 ms

Top changed chunks:

Chunk Raw diff Gzip diff
dist/src/9919.js +131 B +75 B
dist/src/9722.js +294 B +66 B
dist/src/generic.js +122 B +56 B
dist/src/cli.js +111 B +45 B

@thymikee

Copy link
Copy Markdown
Member Author

Blocking this one because the screenshot digest does not survive the shipped client/CLI path.

The daemon view returns a digest shape with path, overlayCount, leveled overlayRefs, and artifacts. But client.capture.screenshot() still normalizes the daemon data into CaptureScreenshotResult, which only returns path plus full-geometry overlay refs. Digest overlayRefs only have ref/label, so readScreenshotResultData() drops them, while overlayCount and artifacts are ignored. The CLI screenshot command then re-emits only path/overlayRefs, so agent-device screenshot --level digest --json will not expose the digest shape this PR adds.

Please add a shipped-path test through client.capture.screenshot() or the CLI with responseLevel: digest, and either preserve raw leveled data for non-default levels or introduce a typed digest result union before the client formatter runs. The daemon-level view itself looks aligned with the Phase 4 direction, but the current tests only prove the internal view function.

The daemon screenshot view returns a leveled digest (path, overlayCount, leveled
overlayRefs, artifacts), but client.capture.screenshot() always ran the data
through readScreenshotResultData, which keeps only path + full-geometry overlay
refs and drops overlayCount/artifacts — so the digest never reached SDK/CLI
callers. Make the capture method level-aware: when the effective responseLevel
(request override or client config) is non-default, return the leveled payload
verbatim instead of normalizing it. Default-level behavior is unchanged.

Adds shipped-path client tests: capture.screenshot({ responseLevel: 'digest' })
returns the raw digest (overlayCount/artifacts preserved, no normalizer
identifiers); the default path still normalizes. Kept the return type as
CaptureScreenshotResult (the union variant cascaded into many default-path
consumers); the caller opted into the level, so the runtime value is leveled.
@thymikee

Copy link
Copy Markdown
Member Author

Fixed (pushed). You're right that the digest died at the client boundary. client.capture.screenshot() is now level-aware: when the effective responseLevel (request override or client config) is non-default, it returns the daemon's leveled payload verbatim instead of running it through readScreenshotResultData (which kept only path + full-geometry refs and dropped overlayCount/artifacts). Default-level behavior is byte-identical.

Added the requested shipped-path tests through client.capture.screenshot(): responseLevel: 'digest' returns the raw digest (overlayCount + leveled overlayRefs + artifacts preserved, no normalizer identifiers); the default path still normalizes.

On typing: I kept the return type as CaptureScreenshotResult rather than a | CommandRequestResult union — the union cascaded into many default-path consumers (every .overlayRefs/snapshot .nodes access broke), so I preserve the raw runtime data and document that for a non-default level the static type is the default shape. Note: the snapshot client path has the identical gap (from #942) — happy to do that as a focused follow-up. tsc/oxlint/fallow clean, 1113 tests pass.

@thymikee

Copy link
Copy Markdown
Member Author

Thanks, the client.capture.screenshot() side now preserves the leveled payload.

There is still one shipped-path gap from the original blocker: the CLI screenshot command re-normalizes the client result into a new object with only path and overlayRefs. With --level digest, the client now returns the raw digest, but src/cli/commands/screenshot.ts still does:

const data = {
  path: result.path,
  ...(result.overlayRefs ? { overlayRefs: result.overlayRefs } : {}),
};

So agent-device screenshot --level digest --json still drops overlayCount and artifacts, even though those are the useful digest fields added by this PR. Please add the CLI shipped-path test too and make the CLI preserve the non-default leveled payload instead of rebuilding the default screenshot shape.

agent-device screenshot --level digest --json still dropped overlayCount and
artifacts: the CLI command rebuilt the default { path, overlayRefs } shape from
the (now leveled) client result. Make screenshotCommand level-aware — for a
non-default responseLevel it emits the leveled payload verbatim (JSON / JSON
text). Adds a CLI shipped-path test (--level digest --json preserves the digest;
the default level still emits the normalized shape). Factors the predicate into
a shared isNonDefaultResponseLevel in contracts.ts, reused by the client helper.
@thymikee

Copy link
Copy Markdown
Member Author

Fixed (pushed). agent-device screenshot --level digest --json now preserves the digest: screenshotCommand is level-aware and, for a non-default responseLevel, emits the client's leveled payload verbatim instead of rebuilding the default { path, overlayRefs } shape — so overlayCount and artifacts survive. Added the CLI shipped-path test you asked for (--level digest --json preserves the digest; the default level still emits the normalized shape). Factored the check into a shared isNonDefaultResponseLevel (contracts.ts) reused by the client helper. tsc/oxlint/fallow clean, 977 tests pass.

thymikee added a commit that referenced this pull request Jun 30, 2026
client.capture.snapshot() always ran the daemon data through
normalizeSnapshotResult, which expects the full `nodes` tree — so a non-default
responseLevel digest ({ nodeCount, refs }) collapsed to an empty snapshot, the
same gap #945 fixed for screenshot. Make it level-aware: when the effective
responseLevel is non-default, return the leveled payload verbatim. Default-level
behavior is unchanged. Adds a shipped-path client test asserting the digest
survives (nodeCount/refs preserved, no normalizer identifiers).
@thymikee

Copy link
Copy Markdown
Member Author

Re-reviewed the latest fix at 397a663.

The previous CLI blocker is addressed now: client.capture.screenshot() passes non-default leveled payloads through, and the dedicated screenshot CLI handler also preserves the raw digest when --level digest is used instead of rebuilding { path, overlayRefs }. The new shipped-path test covers agent-device screenshot --level digest --json preserving overlayCount and artifacts, with a separate default-level test keeping the normalized output path.

All 21 checks are green. I do not see an actionable blocker now.

@thymikee thymikee added the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jun 30, 2026
… 4 (#949)

* fix: preserve the snapshot digest through the client capture path

client.capture.snapshot() always ran the daemon data through
normalizeSnapshotResult, which expects the full `nodes` tree — so a non-default
responseLevel digest ({ nodeCount, refs }) collapsed to an empty snapshot, the
same gap #945 fixed for screenshot. Make it level-aware: when the effective
responseLevel is non-default, return the leveled payload verbatim. Default-level
behavior is unchanged. Adds a shipped-path client test asserting the digest
survives (nodeCount/refs preserved, no normalizer identifiers).

* fix: preserve leveled digests through the generic CLI path

agent-device snapshot --level digest --json dropped nodeCount/refs: snapshot
goes through the generic CLI path (runCliCommandWithOutput -> formatCliOutput),
whose snapshot formatter serializes the default CaptureSnapshotResult shape and
discards digest-only fields. Make runGenericClientBackedCommand level-aware: for
a non-default responseLevel it emits the raw leveled payload verbatim (JSON /
JSON text), bypassing the default-shape formatter. This generalizes to any
generic-path command. Adds a snapshot CLI shipped-path test.
@thymikee thymikee merged commit 193deaf into main Jun 30, 2026
21 checks passed
@thymikee thymikee deleted the feat/phase4-screenshot-digest-view branch June 30, 2026 08:46
@github-actions

Copy link
Copy Markdown
PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-30 08:46 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-human Valid work that needs human implementation, judgment, or maintainer merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant